18F-FDG PET/CT based model for predicting malignancy in pulmonary nodules: a meta-analysis

Background Several studies to date have reported on the development of positron emission tomography (PET)/computed tomography (CT)-based models intended to distinguish between benign and malignant pulmonary nodules (PNs). This meta-analysis was designed to clarify the utility of these PET/CT-based conventional parameter models as tools for the differential diagnosis of PNs.
Methods Relevant studies published through September 2023 were identified by searching the Web of Science, PubMed, and Wanfang databases, after which Stata v12.0 was used to conduct pooled analyses of the resultant data.
Results This meta-analysis included a total of 13 retrospective studies that analyzed 1,731 malignant and 693 benign PNs. The pooled sensitivity, specificity, PLR, and NLR values for the PET/CT-based models developed in these studies were 88% (95% CI: 86–91%), 78% (95% CI: 71–85%), 4.10 (95% CI: 2.98–5.64), and 0.15 (95% CI: 0.12–0.19), respectively. The pooled analyses of model sensitivity (I² = 69.25%), specificity (I² = 78.44%), PLR (I² = 71.42%), and NLR (I² = 67.18%) were all subject to significant heterogeneity. The overall area under the curve (AUC) value for these models was 0.91 (95% CI: 0.88–0.93). When differential diagnosis was instead performed based on PET results only, the corresponding pooled sensitivity, specificity, PLR, and NLR values were 92% (95% CI: 85–96%), 51% (95% CI: 37–66%), 1.89 (95% CI: 1.36–2.62), and 0.16 (95% CI: 0.07–0.35), with all four being subject to significant heterogeneity (I² = 88.08%, 82.63%, 80.19%, and 86.38%, respectively). The AUC for these pooled analyses was 0.82 (95% CI: 0.79–0.85).
Conclusions These results suggest that PET/CT-based models may offer diagnostic performance superior to that of PET results alone when distinguishing between benign and malignant PNs.


Introduction
Pulmonary nodules (PNs) are small (≤ 3 cm), non-transparent lesions surrounded by lung parenchymal tissue that are not the result of atelectasis, mediastinal lymphadenopathy, or pleural effusion [1][2][3]. For nodules > 6 mm in size, routine computed tomography (CT)-based follow-up is warranted [4], and the risk of PN malignancy rises 1.1-fold with each 1 mm increase in diameter [5]. Analyses of patient clinical data and CT imaging findings are the most commonly used approach to PN diagnosis [6][7][8].
CT features often indicative of PN malignancy include the CT bronchus sign, vascular convergence sign, pleural retraction, lobulation, and spiculation [6][7][8]. Clinical risk factors for PN malignancy include more advanced age, elevated serum levels of tumor marker proteins, and a history of smoking [6,9]. Researchers have devised an array of predictive models based on these clinical and imaging features with the goal of more reliably identifying malignant PNs [6][7][8]. Most CT-derived imaging features, however, are classified as binary variables whose identification can vary with the experience level of the attending physician. More reliable quantitative imaging strategies are thus needed to minimize this potential for bias, thereby increasing the odds of accurately diagnosing PNs. 18F-fludeoxyglucose (18F-FDG) positron emission tomography (PET)/CT scans have emerged as a powerful approach to PN diagnosis, with maximum standardized uptake values (SUVmax) serving as a proxy for radiotracer uptake on imaging scans [10]. Given these advantages, researchers have also incorporated PET/CT imaging parameters into predictive models designed to diagnose PNs in an effort to achieve superior accuracy [11][12][13][14][15][16][17][18][19][20][21][22][23]. However, the reported diagnostic performance of these individual PET/CT-based models has varied substantially among studies [11][12][13][14][15][16][17][18][19][20][21][22][23]. There thus remains a pressing need for large-scale analyses capable of systematically clarifying the diagnostic utility of the models developed to date.
Accordingly, the present meta-analysis was conducted to clarify the diagnostic performance of PET/CT-based models when used for the differential diagnosis of potentially malignant PNs.
To be eligible for inclusion, studies had to be: (1) focused on the differential diagnosis of malignant or benign PNs, (2) centered on the development or testing of PET/CT-based models that were provided within the study, and (3) transparent with respect to the true positive (TP), true negative (TN), false positive (FP), and false negative (FN) values associated with the tested models. Case reports, non-human studies, and reviews were excluded from this study.

Data extraction and quality analyses
Two investigators were responsible for independently extracting pertinent data from these studies, including baseline study data, baseline patient data, and the results of diagnostic analyses. Any discrepancies were resolved by a third investigator. The QUADAS-2 tool was used to gauge risk of bias [24].

Definitions
TP results were those for which both PET/CT-based models and final diagnoses were indicative of PN malignancy, whereas FP results were those for which PET/CT-based models predicted that a given lesion was malignant but it was ultimately found to be benign. Conversely, TN results were those for which both PET/CT-based models and final diagnoses indicated that a PN was benign, whereas FN results were those for which PET/CT-based models predicted that a given lesion was benign but it was ultimately found to be malignant.
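The four definitions above can be sketched as a small tally routine. This is a minimal illustration, not the authors' code: the `Counts` class, the `tally` helper, and the six example nodules are hypothetical and are not drawn from any included study.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Counts:
    """2x2 diagnostic table for one predictive model (hypothetical helper)."""
    tp: int = 0
    fp: int = 0
    tn: int = 0
    fn: int = 0


def tally(pairs: Iterable[Tuple[bool, bool]]) -> Counts:
    """Tally (model_predicts_malignant, final_diagnosis_malignant) pairs."""
    tp = fp = tn = fn = 0
    for predicted, actual in pairs:
        if predicted and actual:
            tp += 1   # model and final diagnosis both malignant
        elif predicted and not actual:
            fp += 1   # model malignant, final diagnosis benign
        elif not predicted and not actual:
            tn += 1   # model and final diagnosis both benign
        else:
            fn += 1   # model benign, final diagnosis malignant
    return Counts(tp, fp, tn, fn)


def sensitivity(c: Counts) -> float:
    """Share of truly malignant nodules that the model flags as malignant."""
    return c.tp / (c.tp + c.fn)


def specificity(c: Counts) -> float:
    """Share of truly benign nodules that the model flags as benign."""
    return c.tn / (c.tn + c.fp)


# Six hypothetical nodules, for illustration only.
example = tally([
    (True, True), (True, True), (True, False),
    (False, False), (False, False), (False, True),
])
```

Here `example` holds two TP, one FP, two TN, and one FN result, so both sensitivity and specificity are 2/3 for this toy input.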

Meta-analysis
Stata v12.0 (Stata Corporation, TX, USA) was used to compute pooled sensitivity, specificity, diagnostic score, negative likelihood ratio (NLR), positive likelihood ratio (PLR), and summary receiver operating characteristic (SROC) curves for this study. A given predictive model was considered to exhibit high diagnostic performance if it exhibited an NLR < 0.2 or a PLR > 5. An area under the SROC curve (AUC) value greater than 0.8 was also considered to indicate a high degree of diagnostic utility [3]. RevMan v5.3 was used to compare pooled SUVmax values between benign and malignant PNs. I² values were employed to gauge the degree of heterogeneity, with I² > 50% indicating that such heterogeneity was significant. The possibility of publication bias was assessed with Deeks' funnel plots, and P < 0.05 served as the threshold for defining statistical significance.
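The thresholds above can be illustrated with the textbook definitions of the likelihood ratios and of I². This is a sketch under stated assumptions rather than a reproduction of the Stata analysis: the function names are hypothetical, the Cochran's Q value is invented for illustration, and pooled likelihood ratios from a meta-analytic model need not equal ratios computed directly from pooled sensitivity and specificity.

```python
def positive_lr(sens: float, spec: float) -> float:
    """PLR = sensitivity / (1 - specificity)."""
    return sens / (1.0 - spec)


def negative_lr(sens: float, spec: float) -> float:
    """NLR = (1 - sensitivity) / specificity."""
    return (1.0 - sens) / spec


def high_performance(plr: float, nlr: float) -> bool:
    """Rule stated in the text: NLR < 0.2 or PLR > 5."""
    return nlr < 0.2 or plr > 5.0


def i_squared(q: float, df: int) -> float:
    """Higgins' I^2 (%) from Cochran's Q and its degrees of freedom."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


def heterogeneity_significant(i2_percent: float) -> bool:
    """Rule stated in the text: I^2 > 50% counts as significant."""
    return i2_percent > 50.0


# Point calculation from the pooled sensitivity (0.88) and specificity (0.78)
# reported in the abstract; roughly 4.0 and 0.15.
plr = positive_lr(0.88, 0.78)
nlr = negative_lr(0.88, 0.78)

# Hypothetical Q statistic for 13 studies (df = 12); not a value from the paper.
i2 = i_squared(q=40.0, df=12)
```

With these inputs the direct calculation gives a PLR close to 4.0, slightly below the reported pooled value of 4.10, which is consistent with the pooled estimates coming from the fitted model rather than from this ratio.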

Study selection
The initial search strategy returned 526 studies, of which 13 were found to be relevant and incorporated into the final analyses (Fig. 1). All 13 studies were retrospective in design; 11 were conducted in China and 2 in Spain. For further study-specific details, see Table 1.
A total of 1,731 malignant and 693 benign PNs were ultimately included in these studies. The number of predictors included in individual predictive models ranged from 2 to 7 (Table 2). Aside from PET/CT parameters, age was included as a predictor in 12 of the 13 models. Common CT features associated with malignancy, namely lobulation, spiculation, and pleural retraction, appeared in 6, 5, and 3 models, respectively. Because the models differed in performance, they yielded different numbers of TP, TN, FP, and FN results. For details regarding raw TP, FP, TN, and FN data, see Table 3.

SUVmax values
The mean SUVmax values for benign and malignant PNs were reported in four studies [13,15,20,21].
Significantly higher pooled SUVmax values were observed for malignant PNs as compared to benign nodules (P < 0.00001, Fig. 5a), although significant heterogeneity was detected (I² = 60%). Sensitivity analyses suggested that the study conducted by Liu et al. [16] was the greatest source of heterogeneity, but even with the removal of this study the pooled SUVmax of malignant PNs remained higher than that of benign PNs (P < 0.00001). Funnel plots revealed a low risk of publication bias (Fig. 5b).

Discussion
The present meta-analysis explored the performance of PET/CT-based models as tools for the differential diagnosis of PNs. The overall pooled AUC value of 0.91 was indicative of excellent predictive performance in this context, while the low NLR value (0.15) demonstrates that these PET/CT-based models can satisfactorily diagnose benign PNs when predictive scores fall below the established cut-off value. As the pooled PLR value of 4.10 was less than 5, however, the diagnostic ability of these PET/CT-based models for malignant PNs appears to be only moderate when predictive scores fall above the established cut-off value. PET/CT imaging can yield both CT images that offer morphological insight regarding a given lesion and PET images capable of quantifying glucose metabolism rates. PET scans thus enable the detection of malignant lesions composed of highly metabolically active cells, given that they take up 18F-FDG and glucose at higher rates than do benign cells [25,26]. In the present meta-analysis, a significantly higher pooled SUVmax value was exhibited by malignant PNs as compared to benign PNs.
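The asymmetry between the pooled NLR and PLR can be made concrete with the standard odds form of Bayes' theorem. The sketch below is illustrative only: the function name is hypothetical, and the 50% pretest probability is an assumed value chosen for simplicity, not a figure from the included studies.

```python
def post_test_probability(pretest_probability: float,
                          likelihood_ratio: float) -> float:
    """Convert a pretest probability to a post-test probability.

    Uses post-test odds = pretest odds x likelihood ratio.
    """
    if not 0.0 < pretest_probability < 1.0:
        raise ValueError("pretest_probability must lie strictly between 0 and 1")
    if likelihood_ratio < 0.0:
        raise ValueError("likelihood_ratio must be non-negative")
    pretest_odds = pretest_probability / (1.0 - pretest_probability)
    post_test_odds = pretest_odds * likelihood_ratio
    return post_test_odds / (1.0 + post_test_odds)


# Assumed pretest probability of malignancy (hypothetical, for illustration).
PRETEST = 0.5

# Pooled likelihood ratios reported for the PET/CT-based models.
after_positive = post_test_probability(PRETEST, 4.10)  # score above cut-off
after_negative = post_test_probability(PRETEST, 0.15)  # score below cut-off
```

Under this assumed pretest probability, a score above the cut-off raises the probability of malignancy to roughly 80%, while a score below the cut-off lowers it to roughly 13%, matching the reading that these models are better at ruling malignancy out than at confirming it.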
The diagnostic utility of individual CT features is relatively limited when evaluating PNs. In prior meta-analyses assessing the diagnostic performance of lobulation sign, calcification, and spiculation as approaches to the differential diagnosis of PNs, the AUC values were between 0.65 and 0.76 [1][2][3]. The AUC for the diagnostic utility of PET alone in the present study was 0.82, but the pooled specificity was just 51%. High levels of 18F-FDG uptake can also be observed for benign inflammatory, infectious, or granulomatous disease-associated lesions [27], contributing to a relatively low PLR of 1.89. The comparison of diagnostic performance between the predictive models and PET alone suggests that the diagnostic ability of PET alone is limited when evaluating PNs, emphasizing the need to combine multiple signs in an effort to improve the performance of diagnostic models.
There are many advantages to utilizing mathematical models when diagnosing PNs. Notably, these models allow patients to be assessed in a more objective manner, yielding a predictive score reflective of the odds of PN malignancy. In addition, these models can provide risk coefficients for all incorporated predictive factors. The Mayo model was the first predictive model designed to distinguish between benign and malignant PNs [28]. Herder et al. [29] combined the Mayo model with PET results to establish the first PET/CT-based model, which exhibited an AUC of 0.92, in line with the pooled AUC measured in the present meta-analysis. This AUC value was also higher than those of the Mayo model (0.79) and of PET scanning results alone (0.88) [29].
In addition to imaging features, predictive models can also incorporate levels of tumor markers or particular clinical features [3]. More advanced age and higher serum concentrations of carcinoembryonic antigen have both been linked to a greater risk of PN malignancy [3,9]. While age was included in most of the predictive models analyzed herein, none incorporated tumor markers. Additional research focused on developing new PET/CT-based predictive models incorporating clinical characteristics, imaging features, and tumor marker levels is thus warranted to improve diagnostic accuracy.
This meta-analysis is subject to certain limitations. For one, as all included studies were retrospective in design, these findings are subject to a high risk of bias. Moreover, many of the included studies failed to indicate whether patients were recruited consecutively, and this oversight may have influenced the diagnostic performance of the models developed in individual studies. Next, different models contained different predictive factors, such that the diagnostic results were influenced not only by PET/CT but also by other factors. These models nonetheless share a common feature in that they all provide a comprehensive and quantitative analysis of PNs. Lastly, the included studies did not utilize consistent reference standards, again potentially impacting the resultant diagnostic accuracy.

Conclusions
In summary, PET/CT-based models appear to exhibit promising diagnostic performance when used to distinguish between benign and malignant PNs, outperforming PET-derived SUVmax values alone when employed for the differential diagnosis of PNs.

Fig. 1 The study selection process for this meta-analysis

Fig. 2 (A) The quality assessment of each included study. (B) The summary of the quality assessment

Fig. 5 (a) The forest plot of the pooled SUVmax values between malignant and benign PNs. (b) The assessment of the publication bias of SUVmax values

Table 1 Characteristics of studies included in the meta-analysis
Columns: Studies; Year; Country; Blind; Sample size; Male/Female; Age (y); Malignant/Benign; Reference standard
B: biopsy; F: follow-up; S: surgery

Table 2 The details of each predictive model
PET: positron emission tomography

Table 3 Raw data of diagnostic performance of studies included in this meta-analysis
FN: false negative; FP: false positive; PET/CT: positron emission tomography/computed tomography; TN: true negative; TP: true positive
Li et al. Journal of Cardiothoracic Surgery (2024) 19:148